
    Differential Privacy and the Fat-Shattering Dimension of Linear Queries

    In this paper, we consider the task of answering linear queries under the constraint of differential privacy. This is a general and well-studied class of queries that captures other commonly studied classes, including predicate queries and histogram queries. We show that the accuracy to which a set of linear queries can be answered is closely related to its fat-shattering dimension, a property that characterizes the learnability of real-valued functions in the agnostic-learning setting. Comment: Appears in APPROX 201
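
    For context, the simplest way to answer a single linear query privately is the standard Laplace mechanism; the sketch below is illustrative only (it is not the mechanism analyzed in the paper) and treats the database as a histogram over the universe and the query as a weight vector in [0, 1]:

    import numpy as np

    # Illustrative sketch: the standard Laplace mechanism, not the paper's mechanism.
    def laplace_linear_query(histogram, weights, epsilon):
        # A linear query is <weights, histogram>; adding or removing one record
        # changes the answer by at most max|weights|, so Laplace noise with
        # scale max|weights| / epsilon gives epsilon-differential privacy.
        true_answer = float(np.dot(weights, histogram))
        sensitivity = float(np.max(np.abs(weights)))
        noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
        return true_answer + noise

    # Example: a predicate query (0/1 weights) over a universe of size 5.
    hist = np.array([10, 3, 0, 7, 2])     # counts per universe element
    query = np.array([1, 0, 1, 1, 0])     # which elements satisfy the predicate
    print(laplace_linear_query(hist, query, epsilon=0.5))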

    Lower bounds in differential privacy

    This is a paper about private data analysis, in which a trusted curator holding a confidential database responds to real vector-valued queries. A common approach to ensuring privacy for the database elements is to add appropriately generated random noise to the answers, releasing only these noisy responses. In this paper, we investigate various lower bounds on the noise required to maintain different kinds of privacy guarantees. Comment: Corrected some minor errors and typos. To appear in Theory of Cryptography Conference (TCC) 201
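
    For reference, the guarantees most commonly meant by "different kinds of privacy guarantees" are pure and approximate differential privacy; in the usual formulation, a randomized mechanism M satisfies them when, for all databases x, x' differing in one element and all measurable sets S,

    \Pr[M(x) \in S] \;\le\; e^{\varepsilon} \Pr[M(x') \in S]                     % pure \varepsilon-differential privacy
    \Pr[M(x) \in S] \;\le\; e^{\varepsilon} \Pr[M(x') \in S] \;+\; \delta        % approximate (\varepsilon, \delta)-differential privacy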

    Distributed Private Heavy Hitters

    In this paper, we give efficient algorithms and lower bounds for solving the heavy hitters problem while preserving differential privacy in the fully distributed local model. In this model, there are n parties, each of which possesses a single element from a universe of size N. The heavy hitters problem is to find the identity of the most common element shared amongst the n parties. In the local model, there is no trusted database administrator, and so the algorithm must interact with each of the n parties separately, using a differentially private protocol. We give tight information-theoretic upper and lower bounds on the accuracy to which this problem can be solved in the local model (giving a separation between the local model and the more common centralized model of privacy), as well as computationally efficient algorithms even in the case where the size N of the data universe may be exponentially large.
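
    As a concrete (and much simpler) illustration of the local model, the sketch below uses a basic k-ary randomized-response frequency oracle, not the tight algorithms of this paper: each party randomizes its own element before reporting it, and the untrusted aggregator debiases the resulting histogram.

    import numpy as np

    def k_rr(item, k, epsilon, rng):
        # k-ary randomized response: keep the true item with probability
        # e^eps / (e^eps + k - 1), otherwise report a uniformly random *other* item.
        p_true = np.exp(epsilon) / (np.exp(epsilon) + k - 1)
        if rng.random() < p_true:
            return item
        other = rng.integers(k - 1)
        return other if other < item else other + 1

    def estimate_counts(reports, k, epsilon):
        # Debias the raw histogram: E[observed_v] = n_v * p_true + (n - n_v) * p_other.
        n = len(reports)
        p_true = np.exp(epsilon) / (np.exp(epsilon) + k - 1)
        p_other = 1.0 / (np.exp(epsilon) + k - 1)
        observed = np.bincount(reports, minlength=k)
        return (observed - n * p_other) / (p_true - p_other)

    # Illustrative local-model sketch, not the algorithms from the paper.
    rng = np.random.default_rng(0)
    data = [2, 2, 2, 5, 1, 2, 5]          # one element per party, universe size 8
    reports = [k_rr(x, 8, 1.0, rng) for x in data]
    estimates = estimate_counts(np.array(reports), 8, 1.0)
    print(int(np.argmax(estimates)))       # candidate heavy hitter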

    On the relation between Differential Privacy and Quantitative Information Flow

    Differential privacy is a notion that has emerged in the community of statistical databases, as a response to the problem of protecting the privacy of the database's participants when performing statistical queries. The idea is that a randomized query satisfies differential privacy if the likelihood of obtaining a certain answer for a database x is not too different from the likelihood of obtaining the same answer on adjacent databases, i.e. databases which differ from x in only one individual. Information flow is an area of security concerned with the problem of controlling the leakage of confidential information in programs and protocols. Nowadays, one of the most established approaches to quantify and to reason about leakage is based on the Rényi min-entropy version of information theory. In this paper, we critically analyze the notion of differential privacy in light of the conceptual framework provided by Rényi min-entropy. We show that there is a close relation between differential privacy and leakage, due to the graph symmetries induced by the adjacency relation. Furthermore, we consider the utility of the randomized answer, which measures its expected degree of accuracy. We focus on certain kinds of utility functions called "binary", which have a close correspondence with the Rényi min mutual information. Again, it turns out that there can be a tight correspondence between differential privacy and utility, depending on the symmetries induced by the adjacency relation and by the query. Depending on these symmetries we can also build an optimal-utility randomization mechanism while preserving the required level of differential privacy. Our main contribution is a study of the kind of structures that can be induced by the adjacency relation and the query, and how to use them to derive bounds on the leakage and achieve the optimal utility.
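
    For reference, the quantities this kind of analysis builds on (in the standard quantitative-information-flow formulation, with X the secret input and Y the observed answer) are the Rényi min-entropy, its conditional version, and the resulting min-entropy leakage:

    H_\infty(X) = -\log \max_{x} \Pr[X = x]
    H_\infty(X \mid Y) = -\log \sum_{y} \Pr[Y = y] \, \max_{x} \Pr[X = x \mid Y = y]
    \mathcal{L}(X \to Y) = H_\infty(X) - H_\infty(X \mid Y)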

    Privacy-preserving stream aggregation with fault tolerance

    LNCS v. 7397 entitled: Financial Cryptography and Data Security: 16th International Conference, FC 2012 ... Revised Selected Papers. We consider applications where an untrusted aggregator would like to collect privacy-sensitive data from users, and compute aggregate statistics periodically. For example, imagine a smart grid operator who wishes to aggregate the total power consumption of a neighborhood every ten minutes; or a market researcher who wishes to track the fraction of the population watching ESPN on an hourly basis. We design novel mechanisms that allow an aggregator to accurately estimate such statistics, while offering provable guarantees of user privacy against the untrusted aggregator. Our constructions are resilient to user failure and compromise, and can efficiently support dynamic joins and leaves. Our constructions also exemplify the clear advantage of combining applied cryptography and differential privacy techniques. © 2012 Springer-Verlag.
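
    The sketch below is a toy version of the underlying idea only (pairwise masks that cancel in the sum, plus per-user noise); it is not the paper's fault-tolerant construction, and the hard-coded shared secret stands in for a real key-agreement step.

    import hashlib
    import numpy as np

    MOD = 2**61 - 1   # work modulo a large prime so the masks cancel exactly

    # Toy sketch only: not the paper's fault-tolerant construction.
    def pair_mask(i, j, round_id, secret="demo-shared-secret"):
        # Pairwise mask derived from a shared secret; the lower-indexed user adds
        # it and the higher-indexed user subtracts it, so all masks cancel in the sum.
        h = hashlib.sha256(f"{secret}|{min(i, j)}|{max(i, j)}|{round_id}".encode())
        return int.from_bytes(h.digest()[:8], "big") % MOD

    def user_report(i, value, n_users, round_id, epsilon, rng):
        # Each user adds symmetric geometric noise (a discrete analogue of
        # Laplace noise) and the cancelling masks before sending one number.
        alpha = np.exp(-epsilon)
        noise = int(rng.geometric(1 - alpha) - rng.geometric(1 - alpha))
        report = (value + noise) % MOD
        for j in range(n_users):
            if j == i:
                continue
            m = pair_mask(i, j, round_id)
            report = (report + m) % MOD if i < j else (report - m) % MOD
        return report

    def aggregate(reports):
        # The untrusted aggregator sees only masked reports; their sum is the
        # noisy total. Values above MOD // 2 are decoded as negatives.
        s = sum(reports) % MOD
        return s if s <= MOD // 2 else s - MOD

    rng = np.random.default_rng(0)
    consumption = [12, 7, 30, 5, 9]   # e.g. kWh per household in one interval
    reports = [user_report(i, v, len(consumption), 0, 1.0, rng)
               for i, v in enumerate(consumption)]
    print(aggregate(reports))          # close to sum(consumption) = 63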

    Broadening the Scope of Differential Privacy Using Metrics

    Differential Privacy is one of the most prominent frameworks used to deal with disclosure prevention in statistical databases. It provides a formal privacy guarantee, ensuring that sensitive information relative to individuals cannot be easily inferred by disclosing answers to aggregate queries. If two databases are adjacent, i.e. differ in only one individual, then the query should not allow them to be told apart by more than a certain factor. This also induces a bound on the distinguishability of two generic databases, which is determined by their distance on the Hamming graph of the adjacency relation. In this paper we explore the implications of differential privacy when the indistinguishability requirement depends on an arbitrary notion of distance. We show that we can naturally express, in this way, (protection against) privacy threats that cannot be represented with the standard notion, leading to new applications of the differential privacy framework. We give intuitive characterizations of these threats in terms of Bayesian adversaries, which generalize two interpretations of (standard) differential privacy from the literature. We revisit the well-known results stating that universally optimal mechanisms exist only for counting queries: we show that, in our extended setting, universally optimal mechanisms exist for other queries too, notably sum, average, and percentile queries. We explore various applications of the generalized definition, for statistical databases as well as for other areas, such as geolocation and smart metering.
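
    Concretely, the generalized requirement is usually written with an arbitrary metric d in place of the Hamming distance of standard differential privacy: a mechanism K should satisfy, for all databases x, x' and every set Z of possible answers,

    K(x)(Z) \;\le\; e^{\varepsilon \, d(x, x')} \, K(x')(Z)

    Standard differential privacy is recovered when d is the Hamming (adjacency-graph) distance; metrics on locations or on meter readings give the geolocation and smart-metering instances mentioned above.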

    An Improved Private Mechanism for Small Databases

    We study the problem of answering a workload of linear queries Q, on a database of size at most n = o(|Q|) drawn from a universe U, under the constraint of (approximate) differential privacy. Nikolov, Talwar, and Zhang [NTZ] proposed an efficient mechanism that, for any given Q and n, answers the queries with average error that is at most a factor polynomial in log |Q| and log |U| worse than the best possible. Here we improve on this guarantee and give a mechanism whose competitiveness ratio is at most polynomial in log n and log |U|, and has no dependence on |Q|. Our mechanism is based on the projection mechanism of Nikolov, Talwar, and Zhang, but in place of an ad-hoc noise distribution, we use a distribution which is in a sense optimal for the projection mechanism, and analyze it using convex duality and the restricted invertibility principle. Comment: To appear in ICALP 2015, Track

    Take it or Leave it: Running a Survey when Privacy Comes at a Cost

    In this paper, we consider the problem of estimating a potentially sensitive (individually stigmatizing) statistic on a population. In our model, individuals are concerned about their privacy, and experience some cost as a function of their privacy loss. Nevertheless, they would be willing to participate in the survey if they were compensated for their privacy cost. These cost functions are not publicly known, however, nor do we make Bayesian assumptions about their form or distribution. Individuals are rational and will misreport their costs for privacy if doing so is in their best interest. Ghosh and Roth recently showed that in this setting, when costs for privacy loss may be correlated with private types, if individuals value differential privacy then no individually rational direct revelation mechanism can compute any non-trivial estimate of the population statistic. In this paper, we circumvent this impossibility result by proposing a modified notion of how individuals experience cost as a function of their privacy loss, and by giving a mechanism which does not operate by direct revelation. Instead, our mechanism has the ability to approach individuals from the population at random and make them a take-it-or-leave-it offer. This is intended to model the abilities of a surveyor who may stand on a street corner and approach passers-by.

    Concentrated Differential Privacy: Simplifications, Extensions, and Lower Bounds

    "Concentrated differential privacy" was recently introduced by Dwork and Rothblum as a relaxation of differential privacy, which permits sharper analyses of many privacy-preserving computations. We present an alternative formulation of the concept of concentrated differential privacy in terms of the Renyi divergence between the distributions obtained by running an algorithm on neighboring inputs. With this reformulation in hand, we prove sharper quantitative results, establish lower bounds, and raise a few new questions. We also unify this approach with approximate differential privacy by giving an appropriate definition of "approximate concentrated differential privacy.